Search Results: "tytso"

14 September 2008

Theodore Ts'o: New toy: iPod Touch (2nd Generation)

With the price drop, I finally decided to get a 32GB iPod Touch, and I have to admit, Apple has done a really nice job. Its decisions about which applications to arbitrarily blacklist from its AppStore (either now, or without warning in the future) are evil, of course, but I don’t plan to develop on a locked-in platform such as the iPod/iPhone, so that’s not a problem. And of course, given AT&T’s evil customer service, I won’t be getting an iPhone any time soon (life’s too short to play cat and mouse with Apple’s cell phone locking games), so this was probably my only opportunity in the short term to play with the iPhone/iPod touch’s e-mail application. My reaction? Apple’s programmers and UI designers are very, very good. As Jim Zemlin has pointed out, if Apple’s locked-down platform is a prison, it’s a velvet-lined one. And I’m not one to like living in prisons, even if they are gorgeously appointed with 50 inch flat panel TV screens and an ocean view. But still, I’m happy with the iPod Touch; it’s relatively cheap for its functionality as an mp3 player, with web and e-mail access via wifi as a bonus. The only important thing is that I not harbor any illusions: this is not an open device, but a locked-down platform. So that means I won’t invest any time in developing for it; nor will I invest any money in any for-pay applications (after all, Apple has the right to disable them at any time, for any reason it deems good and proper). And if anyone really thinks Apple wouldn’t do anything like this, remember, they’re the company that just filed a patent on implementing DRM on clothing and sneakers! Still, it would be nice if an open platform, such as the Nokia N800, had the functionality and usability of the iPod touch. One day, perhaps…

8 August 2008

Theodore Ts'o: Fast ext4 fsck times

This wasn’t one of the things we were explicitly engineering for when we were designing the features that would go into ext4, but one of the things which we’ve found as a pleasant surprise is how much more quickly ext4 filesystems can be checked. Ric Wheeler reported some really good fsck times that were over ten times better than ext3, using filesystems generated by what was admittedly a very artificial/synthetic benchmark. During the past six weeks, though, I’ve been using ext4 on my laptop, and I’ve seen very similar results. This past week, while at LinuxWorld, I’ve been wowing people with the following demonstration. Using an LVM snapshot, I ran e2fsck on the root filesystem on my laptop. So using a 128 gigabyte filesystem, on a laptop drive, this is what people who got to see my demo saw:
e2fsck 1.41.0 (10-Jul-2008)
Pass 1: Checking inodes, blocks, and sizes
Pass 1: Memory used: 3440k/12060k (3311k/130k), time: 17.82/ 5.52/ 1.11
Pass 1: I/O read: 233MB, write: 0MB, rate: 13.08MB/s
Pass 2: Checking directory structure
Pass 2: Memory used: 3440k/13476k (3311k/130k), time: 41.47/ 2.16/ 3.30
Pass 2: I/O read: 274MB, write: 0MB, rate: 6.61MB/s
Pass 3: Checking directory connectivity
Peak memory: Memory used: 3440k/14504k (3311k/130k), time: 59.88/ 7.75/ 4.42
Pass 3: Memory used: 3440k/13476k (3311k/130k), time:  0.04/ 0.02/ 0.01
Pass 3: I/O read: 1MB, write: 0MB, rate: 27.38MB/s
Pass 4: Checking reference counts
Pass 4: Memory used: 3440k/6848k (3310k/131k), time:  0.25/ 0.24/ 0.00
Pass 4: I/O read: 0MB, write: 0MB, rate: 0.00MB/s
Pass 5: Checking group summary information
Pass 5: Memory used: 3440k/5820k (3310k/131k), time:  3.13/ 1.85/ 0.10
Pass 5: I/O read: 5MB, write: 0MB, rate: 1.60MB/s
  779726 inodes used (9.30%)
       1 non-contiguous inode (0.0%)
         # of inodes with ind/dind/tind blocks: 719/712/712
22706429 blocks used (67.67%)
       0 bad blocks
       4 large files
  673584 regular files
   58903 directories
    1304 character device files
    4575 block device files
      11 fifos
    1818 links
   41336 symbolic links (32871 fast symbolic links)
       4 sockets
--------
  781535 files
Memory used: 3440k/5820k (3376k/65k), time: 63.35/ 9.86/ 4.54
I/O read: 511MB, write: 1MB, rate: 8.07MB/s
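For anyone who wants to reproduce this, the snapshot-and-check procedure boils down to something like the following sketch; the volume group and volume names here are placeholders, and the statistics flags assume a reasonably recent e2fsprogs:
# create a snapshot of the root LV, check it read-only, then discard it
lvcreate --snapshot --size 4G --name rootsnap /dev/laptop-vg/root
e2fsck -f -n -v -tt /dev/laptop-vg/rootsnap    # -n: make no changes, -tt: per-pass timing, -v: full statistics
lvremove -f /dev/laptop-vg/rootsnap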
How does this compare against ext3? To answer that, I copied my entire ext4 file system to an equivalently sized partition formatted for use with ext3. This comparison is a little unfair since the ext4 file system has six weeks of aging on it, whereas the ext3 filesystem was a fresh copy, so the directories are a bit more optimized. That probably explains the slightly better times in pass 2 for the ext3 file system. Still, it was no contest; the ext4 file system was almost seven times faster to check using e2fsck compared to the ext3 file system. Fsck on an ext4 filesystem is fast!
Comparison of e2fsck times on a 128GB partition

Pass        ext3                                          ext4
            real (s)  user (s)  sys (s)  MB read   MB/s   real (s)  user (s)  sys (s)  MB read   MB/s
1             382.63     18.06    14.99     2376   6.21      17.82      5.52     1.11      233  13.08
2              31.76      1.76     2.13      303   9.54      41.47      2.16     3.30      274   6.61
3               0.03      0.01     0.00        1  31.00       0.04      0.02     0.01        1  27.38
4               0.20      0.20     0.00        0   0.00       0.25      0.24     0.00        0   0.00
5               9.86      1.26     0.22        5   0.51       3.13      1.85     0.10        5   1.60
Total         424.81     21.36    17.34     2685   6.32      63.35      9.86     4.54      511   8.07

26 April 2008

Mark Brown: Who's in charge?

Apparently the use of terms describing free software as organic or non-organic, depending on the extent to which the piece of software concerned is controlled or driven by a single company, wound a few people up, partly due to the strong value judgements that the terms tend to imply. The terms I found myself using in a conversation about web frameworks with some Scotland on Rails attendees were commercial and corporate. These apply better in the web sphere, where the business side of things is much more to the fore than it is in most free software work, but the principle is there. Money floating around isn’t an issue; it’s the extent to which you have to deal with a particular company to get things done that makes the difference.

25 April 2008

Theodore Ts'o: Organic vs. Non-Organic Open Source, Revisited

There’s been some controversy generated over my use of the terminology of “Organic” and “Non-Organic” Open Source. Asa Dotzler noted that it wasn’t Mozilla’s original intent to “make a distinction between how Mozilla does open source and how others do open source”. Nessance complained that he didn’t like the term “Non-Organic”, because it was “raw and vague - is it alien, poison, silicon-based?” and suggested instead the term “Synthetic Open Source”, referencing a paper by Siobhán O’Mahony, “What makes a project open source? Migrating from organic to synthetic communities”. Nessance also referenced a series of questions and answers by Stephen O’Grady from RedMonk, where he claimed the distinction between the two doesn’t matter. (Although given that Sun is a paying customer of RedMonk, Stephen admits that this might have influenced his thinking and so he might be “brainwashed” :-). So let’s take some of these issues in reverse order. Does the distinction matter? After all, if the distinction doesn’t matter, then there’s no reason to create or define specialized terminology to describe the difference. Certainly, Brian Aker, a senior technologist from MySQL, thinks it does, as do folks like me and Amanda McPherson and Mike Dolan; but does it really? Are we just saying that because we want to take a cheap shot at Sun? Well, to answer that, let’s go back and ask the question, “Why is Open Source a good thing in the first place?” It’s gotten to the point where people just assume that it’s a good thing, because everybody says it is. But if we go back to first principles, maybe it will become much clearer why this distinction is so important. Consider the Apache web server; it was able to completely dominate the web server market, easily besting all of its proprietary competitors, including the super-deep-pocketed Microsoft. Why? It won because a large number of volunteers were able to collaborate together to create a very fully featured product, using a “stone soup” model where each developer “scratched their own itch”. Many, if not most, of these volunteers were compensated by their employers for their work. Since their employers were not in the web server business, but instead needed a web server as a means (a critical means, to be sure) to pursue their business, there was no economic reason not to let their engineers contribute their improvements back to the Apache project. Indeed, it was cheaper to let their engineers work on Apache collaboratively than it was to purchase a product that would be less suited for their needs. In other words, it was a collective “build vs. buy” decision, with the twist that because a large number of companies were involved in the collaboration, it was far, far cheaper than the traditional “build” option. This is a powerful model, and the fact that Sun originally asked Roy Fielding from the Apache Foundation to assist in forming the Solaris community indicates that at least some people in Sun appreciated why this was so important. There are other benefits of having code released under an Open Source license, such as the ability for others to see the implementation details of your operating system — but in truth, Sun had already made the source code for Solaris available for a nominal fee years before. And, of course, there are plenty of arguments over the exact licensing terms that should be used, such as GPLv2, GPLv3, CDDL, the CPL, MPL, etc., but sometimes those arguments can be a distraction from the central issue.
While the legal issues that arise from the choice of license are important, at the end of the day, the most crucial issue is the development community. It is the strength and the diversity of the development community which is the best indicator for the health and the well-being of an Open Source project. But what about end-users, I hear people cry? End users are important, to the extent that they provide ego-strokes to the developers, to the extent that they provide testing and bug reports to the developers, and to the extent that they provide an economic justification to companies who employ open source developers to continue to do so. But ultimately, end-users affect an open source project only in a very indirect way. Moreover, if you ask commercial end users what they value about Open Source, a survey by Computer Economics indicated that the number one reason why customers valued open source was “reduced dependence on software vendors”, which end users valued 2 to 1 over “lower total cost of ownership”. (Which is why Sun salescritters who were sending around TCO analyses comparing 24×7 phone support from Red Hat with support-by-email from Sun totally missed the point.) What’s important to commercial end users is that they be able to avoid the effects of vendor lock-in, which implies that if all of the developers are employed by one vendor, the project doesn’t provide the value the end users were looking for. This is why whether a project’s developers are dominated by employees from a single company is so important. The license under which the code is released is merely the outward trappings of an open source project. What’s really critical is the extent to which the development costs are shared across a vast global community of developers who have many different means of support. This saves costs for the companies who are using a product being developed in such a fashion; it gives choice to customers about whether they can get their support from company A or company B; and programmers who don’t like the way things are going at one company have an easier time changing jobs while still working on the same project; it’s a win-win-win scenario. In contrast, if a project decides to release its code under an open source license, but nearly all the developers remain employed by a single company, it doesn’t really change the dynamic compared to when the project was previously under a closed-source license. It is a necessary but not sufficient step towards attracting outside contributors, and eventually migrating towards having a true open source development community. But if those further steps are not taken, the hopes that users will think that some project is “cool” because it is under an open-source license will ultimately be in vain. The “Generation Y”/Millennial Generation in particular is very sensitive indeed to astroturfing-style marketing tactics. OK, so this is why the distinction matters. Given that it does, what terms shall we use? I still like “Organic” vs. “Non-organic”. While it may not have been intended by the Mozilla Foundation, the description on their web page, “only a small percentage of whom are actual employees [of the Mozilla Foundation]”, is very much what I and others have been trying to describe. And while I originally used the descriptions “Projects which have an Open Source Development Community” vs. “Projects with an Open Source License but which are dominated by employees from a single company”, I think we can all agree these are very awkward.
We need a better shorthand. Brian Aker from MySQL suggested “Organic” vs. “Non-Organic” Open Source, and I think those terms work well. If some folks think that “Non-Organic” is somehow pejorative (hey, at least we didn’t say “genetically modified Open Source” :-), I suppose we could use “Synthetic Open Source”. I’m not really convinced that is any more appetizing, myself, however. So what would be better terms to use? Please give me some suggestions, and maybe we can come up with a better set of words that everyone is happy with.

Theodore Ts'o: Links 2008-04-25

The Open Source Commandments
Really good ideas that companies should take to heart.
Open Source Commandments II: Passover Penguins
More really good ideas, especially for companies like Sun…
Did Canonical Just Get Punked by Red Hat and Novell?
Interesting thoughts about Linux desktop strategies
rPath to OEM SUSE Linux Enterprise Server from Novell for Appliances
I know a bunch of the folks at rPath, and I very much respect their technology; I think this is a very good thing for them.
Does Microsoft CEO Steve Ballmer need an intervention?
Does anyone think a Microsoft/Yahoo merger makes sense besides Mr. Ballmer?

24 April 2008

Theodore Ts'o: Organic vs. Non-organic Open Source

Brian Aker dropped by and replied to my previous essay by making the following comment:
I believe you are hitting the nail on the organic vs nonorganic open source. I do not believe we have a model for going from one to the other. Linux and Apache both have very different models for contribution but I don’t believe either are really optimized at this point. Optimization to me would lead to a system of less priests and more inclusion.
I made an initial reply as a comment, and then decided it was so long that I should promote it to a top-level post. I assume that when Brian talks about “organic open source” what he means is what I was calling an “open source development community”. Some googling turned up the following definition from Mozilla Firefox’s organic software page: “Our most well-known product, Firefox, is created by an international movement of thousands, only a small percentage of whom are actual employees.” This puts it in contrast with “non-organic” software, where all or nearly all of the developers are employed by one company. (And anyone who proves talented at adding features to that source base soon gets a job offer from that one company. :-) By that definition we can certainly see projects like Wine, MySQL, Ghostscript (at one time), and others as fitting into that model, and being quite successful. There’s nothing really wrong with the non-organic software model, although many such projects have struggled to make enough money when competing with pure proprietary software competitors, with MySQL perhaps being the exception which proves the rule. In most of these cases, though, the project started as organic open source, and then transitioned into the non-organic model when there was a desire to monetize the project and/or when the open source programmers decided that it would be nice if they could turn their avocation into a vocation, and let their hobby put food on the family table. Solaris, of course, is doing something else quite different. They are trying to make the transition from a proprietary customer/supplier relationship to trying to develop an Open Source community, and what Jon’s candidate statement pointed out is that they weren’t really interested in creating an organic open source developer community at all; they wanted the fruits of an open source community, with plenty of application developers, end-users, etc., all participating in that community. We don’t have a lot of precedent for projects that try to go in this direction, but I suspect they are skipping a step when they try to go to the end state without bothering to try to make themselves open to outside developers. And by continuing to act like a corporation, they end up shooting themselves in the foot. For example, the OpenSolaris license still prohibits people from publishing benchmarks or comparisons with other operating systems. This is very common in closed-source operating systems and databases, but it discourages people from even trying to make things better, both within and outside of the Open Solaris core team. Instead, they respond to posts like David Miller’s with “Have you ever kissed a girl?”. (Thanks, Simon, for that quote; I had seen it before, but not for a while, and it pretty well sums up the sheer arrogance of the Open Solaris development team.) So while Linux may not be completely optimized in terms of “less priests and more inclusion”, at least over 1200 developers contributed to 2.6.25 during its development cycle. Compared to that, Open Solaris is positively dominated by high priests, with a “you may not touch the holy-of-holies” attitude; heck, they won’t even allow you to compare them to other religions without branding you a heretic and suing you for licensing violations!

19 April 2008

Theodore Ts'o: What Sun was trying to do with Open Solaris

I was recently checking to see what, if any, follow-up there had been from Sun’s ham-handed handling of the Open Solaris Trademark, and I ran across this very interesting comment from Jon Plocher’s Candidate Statement for the Open Solaris Governing Board:
“I also think there was a misunderstanding about what Sun desired when it launched the community (in part) to encourage developers to adopt and use Solaris. My take is that, while there *is* value in getting more kernel, driver and utility developers contributing to and porting the (open) Solaris operating system, there is significantly *more* value in having a whole undivided ecosystem based on a compatible set of distributions, where application developers, university students, custom distro builders and users are all able to take advantage of each other’s work. Put these two things together, and you can see Sun’s predicament. Sun *wanted* a community that empowered application developers, but *got* a community aimed squarely at kernel hackers. Whether you see this as the “kernel.org -vs- Ubuntu” fight, or the “fully open -vs- MySQL model” argument, in my opinion, it all is simply a reflection of the above mismatched expectations.”
So that explains why it’s taken three long years to try to get basic open source development tools (such as putting Open Solaris source code in a distributed SCM located outside of the Sun firewall) for Open Solaris. It never was Sun’s intention to try to promote a kernel engineering community, or at least, it was certainly not a high priority for them to do so. This can be shown by the fact that as of this writing they are still using the incredibly clunky requester/sponsor system for getting patches into Solaris; setting up a git or mercurial server is not rocket science. This lack explains why Linus gets more contributions while brushing his teeth than Open Solaris gets in a week. So if you run into a Sun salescritter or a Sun CEO claiming that OpenSolaris is just like Linux, it’s not. Fundamentally, Open Solaris has been released under an Open Source license, but it is not an Open Source development community. Maybe it will be someday, as some Sun executives have claimed, but it’s definitely not a priority for Sun; if it were, it would have been done before now. And why not? After all, they are getting all of the marketing benefit of claiming that Solaris is “just like Linux”, without having to deal with any of the messy costs of working with an outside community. As a tactical measure, astroturfing is certainly a valid marketing trick. But after three years, the excuse of “just you wait a little longer, we’re just trying to figure this open source community stuff out” is starting to wear a little thin. Furthermore, if (as Jon Plocher claims) this was about “empowering application programmers”, why was it that Sun’s first act was to trumpet how wonderful it was to release the Solaris source code under an Open Source license? This only seems to make sense if the Open Solaris initiative was really a cynical marketing tactic to try to save Solaris from being viewed as irrelevant. If that was Sun’s intention, I think it is fair to say that from a marketing point of view, the tactic has been at least partially successful — although as Jon has admitted, the goal of creating a full community with application developers, university students, and so on, hasn’t materialized for Open Solaris. Sun has the dream; the Linux community is living it. However, from a business standpoint, I wonder if Sun will really be able to sustain their Solaris engineering team if they will really be doing all of the work themselves, and outside contributions continue at the rate of 0.6 patches per day. After all, the margins when you are selling low-cost AMD servers are much lower than when you are selling über-expensive SPARC servers. With Linux, we have a major advantage in that kernel improvements are coming from multiple companies in the ecosystem, instead of being paid for by a single company. And given that 70-80% of Sun’s AMD servers are running Linux, not Solaris, it’s not clear how Sun justifies their Solaris engineering costs to their shareholders. Furthermore, if Solaris on x86_64 were to actually take off, there’s nothing to stop competitors from selling Solaris support — except that the competitors won’t have to pay the engineering costs to maintain and improve Solaris, so they would be able to provide the support much more cheaply than Sun could. So while Sun’s marketing tactics have kept Solaris alive in some verticals, I have to question how successful Sun will be in the long term.
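To underline the “not rocket science” point, here is a minimal sketch of what a read-only git server involves; the paths and hostname are placeholders:
# export every repository under /srv/git, read-only, over the git:// protocol
git daemon --base-path=/srv/git --export-all --detach
# after which anyone outside the corporate firewall can simply run:
git clone git://git.example.com/some-project.git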

15 April 2008

Theodore Ts'o: AT&T: Customer support horrors

This morning, I just wasted two hours of my life trying to deal with a bill with AT&T. I am a work-at-home employee, and my company has a contract with AT&T so that when I dial 1-700-xxx-xxxx, I can reach the internal corporate phone network. In addition, long distance calls on my home office line are billed to the company at the pre-negotiated corporate rates. I also had a (long-dormant) AT&T long distance account, dating from before I started working at this company. Starting at the beginning of the year, that account grew an $18 monthly fee. When I tried to make it go away, the AT&T consumer side of the house said that according to Verizon, that was because I had my long distance service through AT&T. Which was true, in a sense — AT&T was providing my service, but through a corporate account. But the consumer side of AT&T didn’t understand that, and were, in my opinion, willfully ignorant. Several phone calls later, I got the 1-800 number for the AT&T corporate side of the house, and those folks said they couldn’t look at consumer/personal AT&T accounts, and implied it was my fault that I had both accounts on the line. (This took over an hour while the support person tried multiple things, none of which helped at all.) I finally googled for AT&T CEO’s office, and found a number on the consumerist.com web site that claimed to be the AT&T CEO’s office. It had long since been redirected to a help center, but after I told my tale, I was quickly transferred to an executive customer support person, who was able to fix the problem in ten minutes. The only questions remaining are:
1) Why can’t the different parts of AT&T talk to one another?
2) Will this problem really be solved, or will I see another bill in a month or two and have to spend more time dealing with this mess all over again?
3) When will VOIP services put AT&T long distance out of its misery? (Given the absolute frustration of this morning, this can’t happen soon enough.)
4) Will AT&T compensate me for the two hours of frustration and of my life that I will never get back? (Not bloody likely.)
One thing is for certain: it will be a long, long, LONG time before I will ever voluntarily choose AT&T to provide service for just about anything. Even an iPhone wouldn’t be enough inducement….

22 February 2008

John Goerzen: Revisiting Git and Mercurial

Exactly one year ago today, I wrote about Git, Mercurial, and Bzr. I have long been interested in VCS, and looked at the three main DVCS systems back then.

A Quick Review

Mercurial was, and for the moment, remains, my main VCS. Bzr remains really uninteresting; I don't see it offering anything compelling that Mercurial or Git can't do. My Git gripes mainly revolved around its interface and documentation. Also, I do have Windows people using my software, and need a plausible solution for them, even though I personally do no development on that platform.

Ted Tso wrote his own article in reply to mine, noting that the Git community had identified many of the same things I had and was working on them.

I followed up to Ted with:

... So if Ted's right, and a year from now git is easier to use, better documented, more featureful, and runs well on Windows, it won't be that hard to switch over and preserve history. Ted's the sort of person that usually is right, so maybe I should start looking at hg2git right now.


So I guess that means it's time to start looking at Git again.
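For the record, a history-preserving conversion along those lines looks pretty minimal; here's a sketch using the hg-fast-export script from the fast-export project (assuming that tool; the repository paths are just examples):
# create an empty git repository and import the Mercurial history into it
git init myproject-git
cd myproject-git
hg-fast-export.sh -r /path/to/myproject-hg
git checkout HEAD    # populate the working tree from the imported history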

This is rather rambly, I know. It's late and I want to get these thoughts down before going to sleep...

Looking at Git

I started at the Git wikipedia page for an overview of the software. It linked to two Google Tech Talks about Git: one by Linus Torvalds and another by Randal Schwartz. Of the two, I found Linus' more entertaining and Randal's more informative. Linus' point that CVS is fundamentally broken, and that SVN trying to be "a better CVS" (an early goal of svn, at least) means it too is fundamentally broken, strikes me as quite sound.

One other interesting tidbit I picked up is that git can show you where functions have moved from one file to another, thanks to its rename-detection heuristic. That sounds really sweet, and is the best reason I've yet heard for Git's stubborn refusal to track renames.
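A quick sketch of what that looks like in practice (the file name is just an example): the -M and -C options turn on rename and copy detection, and blame's -C can be given more than once to widen the search to other files:
git log -p -M -C --follow -- src/util.c   # follow content across renames and copies
git blame -C -C src/util.c                # attribute lines that were moved in from elsewhere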

The Landscape

I've been following Mercurial and Darcs somewhat, and not paying much attention to Git. Mercurial has been adding small features, and is nearing version 1.0. Darcs has completed a major overhaul both of its repository format and internal algorithms and is nearing version 2.0, and appears to have finally killed the doppelganger (aka conflict spinlock) bug for good.

Git, meanwhile, seems to have made strides in usability and documentation in its 1.5.x versions.

One thing particularly interesting to me is: what projects are using the different VCSs. High-profile projects now using Mercurial include OpenSolaris, OpenJDK (Java 7), and Mozilla's projects. Git has, of course, the Linux kernel. It also has just about everything associated with freedesktop.org, including X. Also a ton of Unixy stuff.

Both Mercurial and Git communities are working on TortoiseHg/TortoiseGit types of GUIs for Windows users. Git appears to have a sane Windows port now as well, putting it on pretty much even footing with Mercurial and Darcs there. However, I didn't spot anything with obvious Windows ties in the Git "what projects use git" pages.

The greater speed of Mercurial and Git -- even for pushing and pulling small patches -- likely will keep me away from Darcs for the moment.

Onwards...

As time allows (I do have other things keeping me busy), I plan to install git and work through some tutorials and try to use it in practice as much as possible, to get a good feel for it.
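Concretely, the first steps are about as small as it gets -- assuming a Debian-ish system where the package is still called git-core:
apt-get install git-core gitk
git clone git://git.kernel.org/pub/scm/git/git.git
cd git && git log --stat | head -50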

Future

It is beneficial to be using a VCS that is popular, though that is certainly not a major criterion for me. I refuse to use SVN because its lack of distributed functionality makes it too unproductive to be useful. But it looks like Git is gaining a lot of traction these days, especially in Debian circles, which also makes it more interesting.

I notice that Ted did convert e2fsprogs over to git as he said he might, incidentally.

19 January 2008

Theodore Ts'o: Why I purchased the Sony PRS-505 Reader

Although a lot of people have been lauding the Kindle, I recently decided to go with the Sony PRS-505 instead. Yes, the Kindle has built-in EVDO access, and the ability to buy books without a computer, or even browse the web; and yes, Sony has once again demonstrated it can’t create a compelling 21st century computer application to save its life. However, the PRS-505 had a few things that, at least for me, made it a better choice than the Kindle:
  1. The Sony is thinner — I want to be able to slip it into my laptop case and have it take the absolute minimum amount of space.
  2. The Sony simply looks much more elegant than the Kindle; steel with a leather cover simply looks a lot better than white, cheesy plastic.
  3. I’m not interested in buying a lot of DRM’ed ebooks; ergo, I won’t be buying many books from either Sony or Amazon’s web sites. It is highly likely that within 2 years I will be buying a more advanced eBook reader, possibly one with color, and I don’t want to be locked into a single format where I have to go and repurchase all of my books just because the latest and greatest eBook reader uses an incompatible DRM technology from whatever Sony or Amazon has used.
  4. The Sony is $100 cheaper. Given that something better will be available within 2-3 years at the very most, and possibly sooner, I’m just not interested in spending $400 on a first generation prototype.
  5. Perhaps most importantly, the Sony has really, really good open source support. Kovid Goyal’s libprs500 project supports the Sony PRS-500 and PRS-505, and has very good conversion tools, allowing people to convert eBooks previously stored in HTML, PDF, TXT, Microsoft Reader (.lit), and IDPF/Open eBook (.epub) formats into Sony’s format. And with a little bit of work, it does a very, very good job with the conversion. Better yet, its ability to convert multiple HTML pages into a single eBook, with a credible table of contents, means that libprs500 can pull down the New York Times, the Economist, etc., automatically format it into a single eBook which you can save onto your Sony Reader, and then read it while you are on the airplane. No muss, no fuss. I can also take various books that are available on the web as HTML and convert them into an eBook which can be used by the Sony Reader very easily.
This last point is, I believe, one of the best reasons why the Sony Reader will be able to compete very successfully with the Kindle. The libprs500 software is written as a Python application, and it will work on Windows, Linux, and MacOS — and its GUI is far better than the truly pathetic Sony Connect software. Score one for Open Source! In my opinion, Sony should send a very nice gift certificate to Kovid as a thank you; his open source project has added an immeasurable amount of value to their product. The only thing that you can’t do using the libprs500 software is buy DRM’ed books which are locked to the Sony Reader — but that isn’t something that many people will be particularly interested in, I suspect. OK, I did buy Pillars of the Earth, which was available on the Sony site for $6 — hmm, cheaper than Amazon’s $9.99 — but that was an investment I was willing to flush down the toilet when the PRS-505 becomes obsolete, mainly so I could test what buying a DRM’ed book from the Sony web site would be like. But I probably won’t be buying many books with DRM that way. On the other hand, I am quite willing to spend quite a bit more money on non-DRM’ed books from publishers such as Baen Books. Here’s to the hope that the publishing industry figures things out faster than the RIAA’s member companies. In the meantime, I will be mostly pretending that both the Sony and Amazon eBook stores with their proprietary DRM’ed books don’t exist…

3 January 2008

Anthony Towns: tempus fugit

I blogged a fair bit about darcs some time ago, but since then I’ve not been able to get comfortable with the patch algebra’s approach to dealing with conflicting merges – I think mostly because it doesn’t provide a way for the user to instruct darcs on how to recover from a conflict and continue on. I’ve had a look at bzr since then, but it just feels slow, to the point where I tend to rsync things around instead of using it properly, and it just generally hasn’t felt comfortable. On the other hand, a whole bunch of other folks I respect have been a bit more decisive than I have on this, and from where I sit, there’s been a notable trend:
Keith Packard, Oct 2006
Repository formats matter, Tyrannical SCM selection
Ted Tso, Mar 2007
Git and hg
Joey Hess, Oct 2007
Git transitions, etckeeper, git archive as distro package format
Of course, Rusty swings the other way, as do the OpenSolaris guys. The OpenSolaris conclusions seem mostly out of date if you’re able to use git 1.5, and I haven’t learnt quilt, so I don’t miss its mode of operation the way Rusty does. And as far as the basics go, Carl Worth did an interesting exercise in translating an introduction to Mercurial into the equivalent for git, so that looks okay for git too.

20 November 2007

Russell Coker: Perfect Code vs Quite Good Code

Some years ago I worked on a project where software reliability should have been a priority (managing data that was sometimes needed by the police, the fire brigade, and the ambulance service). Unfortunately the project had been tainted by a large consulting company that was a subsidiary of an accounting firm (I would never have expected accountants to know anything about programming, and several large accounting firms have confirmed my expectations). I was hired to help port the code from OS/2 1.2 to NT 4.0. The accounting firm had established a standard practice of never calling free() because “you might call free() on memory that was still being used”. This was a terribly bad idea at the best of times, and on a 16 bit OS with memory being allocated in 64K chunks the problems were quite obvious to everyone who had any programming experience. The most amusing example of this was a function that allocated some memory and returned a pointer which was being called as if it returned a boolean; one function had a few dozen lines of code similar to if(allocate_some_memory()). I created a second function which called the first, free’d any memory which had been allocated, and then returned a boolean.

Another serious problem with that project was the use of copy and paste coding. A section of code would perform a certain task and someone would need it elsewhere. Instead of making it a function and calling it from multiple places, the code would be copied. Then one copy would be debugged or have new features added and the other copy wouldn’t. One classic example of this was a section of code that displayed an array of data points where each row would be in a colour that indicated its status. However setting a row to red would change the colour of all its columns, setting a row to blue would change all except the last, and changing it to green would change all but the second-last. The code in question had been copied and pasted to different sections with the colours hard-coded. Naturally I wrote a function to change the colour of a row and made it take the colour as a parameter; the program worked correctly and was smaller too. The next programmer who worked on that section of code would only need to make one change - instead of changing code in multiple places and maybe missing one.

Another example of the copy/paste coding was comparing time-stamps. Naturally using libc or OS routines for managing time stamps didn’t occur to them, so they had a structure with fields for the year, month, day, hours, minutes, and seconds that was different from every other such structure in common use, and they had to write their own code to compare them; for further excitement some comparisons were only on date and some were on date and time. Many of these date comparisons were buggy, and often there were two date comparisons in the same function which had different bugs. I created functions for comparing dates and the code suddenly became a lot easier to read, less buggy, and smaller.

I have just read an interesting post by Theodore Ts’o on whether perfect code exists [1]. While I understand both Theodore’s and Bryan’s points of view in this discussion, I think that a more relevant issue for most programmers is how to create islands of reasonably good code in the swamp that is a typical software development project.
While it was impossible for any one person to turn around a badly broken software development project such as the one I describe, it is often possible to make some foundation code work well which gives other programmers a place to start when improving the code quality. Having the worst of the memory leaks fixed meant that memory use could be analysed to find other bugs and having good functions for comparing dates made the code more readable and thus programmers could understand what they were looking at. I don’t claim that my code was perfect, even given the limitations of the data structures that I was using there was certainly scope for improvement. But my code was solid, clean, commented, and accepted by all members of the team (so they would continue writing code in the same way). It might even have resulted in saving someone’s life as any system which provides data to the emergency services can potentially kill people if it malfunctions. Projects based on free software tend not to be as badly run, but there are still some nasty over-grown systems based on free software where no-one seems able to debug them. I believe that the plan of starting with some library code and making it reasonably good (great code may be impossible for many reasons) and then trying to expand the sections of good code is a reasonable approach to many broken systems. Of course the ideal situation would be to re-write such broken systems from scratch, but as that is often impossible rewriting a section at a time often gives reasonable results.

18 November 2007

Theodore Ts'o: Does perfect code exist? (Abstractions, Part 1)

Bryan Cantrill recently wrote a blog entry, where among other things, he philosophized on the concept of “perfect code”. He compares software to math, arguing that Euclid’s greatest common divisor algorithm shows no sign of wearing out, and that when code achieves perfection (or gets close to perfection), “it sediments into the information infrastructure” and the abstractions defined by that code become “the bedrock that future generations may build upon”. Later, in the comments of his blog, when pressed to give some examples of such perfection, he cites a clever algorithm coded by his mentor to divide a high resolution timestamp by a billion extremely efficiently, and Solaris’s “cyclic subsystem”, a timer dispatch function. Watching his talk at Google, with its introduction and sidebar book review of Scott Rosenberg’s “Dreaming in Code”, it’s clear that he very passionately believes that it is possible to write perfect code, and that one should strive for that at all times. Perhaps that’s because he mostly writes code for operating systems, where the requirements change slowly, and for one OS in particular, Solaris, which tries far harder than most software projects to keep published interfaces stable for as long as possible. In contrast, the OS I’ve spent a lot of time hacking on, Linux, quite proudly states that at least inside the kernel, interfaces can not and should not be stable. Greg Kroah-Hartman’s “Stable API Nonsense” is perhaps one of the strongest and most passionate expositions of that philosophy. I can see both sides of the argument, and in their place, both have something to offer. To Bryan’s first point, it is absolutely true that interfaces can become “bedrock” upon which entire ecosystems are built. Perhaps one of the most enduring and impactful examples would be the Unix programming interface, which has since become enshrined by POSIX.2 and successor standards. I would argue, though, that it is the interface that is important, and not the code which initially implemented it. If the interface is powerful enough, and if it appears at the right time, and the initial implementation is good enough (not perfect!), then it can establish itself by virtue of the software which uses it becoming large enough that it assumes an importance all out of scale with its original intention. Of course, sometimes such an interface is not perfect. There is an apocryphal story that when S. Feldman at AT&T labs first wrote the ‘make’ utility, he did so rather quickly, then made it available for his fellow lab members to use, and then went home to sleep. In some versions of the story he had stayed up late and/or pulled an all-nighter to write it, so he slept a long time. When he came back to work, he had come up with a number of ways to improve the syntax of the Makefile. Unfortunately (so goes the story), too many teams were already using the ‘make’ utility, so he didn’t feel he could change the Makefile syntax. I have no evidence that this ever took place, and I suspect it is an urban myth that was invented to explain why Makefiles have a rather ugly and unfortunate syntax, with features that many would call defects, including the use of syntactically significant tab characters which are indistinguishable from other forms of leading whitespace. Another example which is the bane of filesystem designers everywhere are the Unix readdir(2), telldir(2), and seekdir(2) interfaces.
These interfaces fundamentally assume that directories are stored in linear linked lists, and filesystems that wish to use more sophisticated data structures, such as b-trees, have to go to extraordinary lengths in order to support these interfaces. Very few programs use telldir(2) and seekdir(2), but some filesystems such as JFS maintain two b-trees instead of one just to cater to telldir/seekdir. And yet, it is absolutely true that interfaces can be the bedrock for an entire industry. Certainly, whatever its warts, the Unix/Posix interface has stood the test of time, and it has been responsible for the success of many a company and many billions of dollars of market capitalization. But is this the same as perfect code? No, but if billions of dollars of user applications are going to be depending on that code, it’s best if the code which implements such an interface be high quality, and it should attempt to achieve perfection. But what does it mean for code to be perfect? For a particular environment, if the requirements can be articulated clearly, I can accept that code can reach perfection, in that it becomes as fast as possible (for the given computer platform), and it handles all exception cases, etc., etc. Unfortunately, in the real world, the environment and the requirements inevitably change over time. Take, for example, Bryan’s cyclic subsystem, which he proudly touts as being, if not perfect, almost so, and which executes at least 100 times a second on every Solaris system in the world. I haven’t looked at the cyclic system in any detail, since I don’t want to get myself contaminated (the CDDL and GPLv2 licenses are intentionally incompatible, and given that companies — including Sun — have sued over IPR issues, well, one can’t be too careful), but waking up the CPU from its low-power state 100 times a second isn’t a good thing at all if you are worried about energy conservation in data centers — or in laptops. For example, on my laptop, it is possible to keep wakeups down to no more than 30-35 times a second, and it would be possible to do even better but for an abstraction limitation. Suppose, for example, a process wants to sleep and then receive a wakeup 1 millisecond later, and so requests this via usleep(). At the same time, another application wants to sleep until some file descriptor activity takes place, or until 1.2 milliseconds have elapsed. 0.3 milliseconds later, a third process requests a sleep, this time for 0.8 milliseconds. Now, it could be that in all of the above cases, the applications don’t actually need exact timing; if they all get their wakeups plus or minus some fraction of a millisecond, they would be quite cool with that. Unfortunately, the standard timer interfaces have no way of expressing this, and so the OS can’t combine the three wakeups at T+1.0, T+1.1, and T+1.2 milliseconds into one wakeup at T+1.1ms. So this is where Greg K-H’s “Stable API Nonsense” comes into play. We may not be able to solve this problem at the userspace level, but we darn well can solve this problem inside the kernel. Inside the kernel, we can change the timer abstraction to allow device drivers and kernel routines to provide a timer delta plus a notion of how much accuracy is required for a particular timer request. Doing so might change a timer structure that external device drivers had previously depended upon — but too bad, that’s why stable ABI/APIs are not supported for internal kernel interfaces. Could the interface have been extended instead?
Could the interface have been extended instead? Well, perhaps, and perhaps not; if an interface is well designed, it is possible it can be extended in an API- and/or ABI-compatible way. There is usually a performance cost to doing so, however, and sometimes it may make sense to pay that cost, and sometimes it may not. I’ll talk more about that in a future essay.

Yet note what happened to the timer implementation. We have a pretty sophisticated timer implementation inside Linux, one that uses heap data structures and buckets of timers for efficiency. So while I might not call it perfect, it is pretty good. But, oops! Thanks to this new requirement of energy efficiency, it will likely need to be changed to support variable levels of accuracy and the ability to fire multiple timers that are (more or less) coming due in a single CPU wakeup cycle. Does that make it no longer perfect, or no longer pretty good? Well, it just had a new requirement impact the code, and if the criterion for perfection is for the abstraction defined by the code to be “bedrock” and never-changing, with no need to make any changes in said code over multiple years, I would argue that little to no code, even OS code, can ever achieve perfection by that definition — unless that code is no longer being used. (This is the first of a multi-part series of essays on abstractions that I have planned. The next essay which I plan to write will be entitled “Layer surfing, or jumping between layers”, and will be coming soon to a blog near you….)

12 November 2007

Theodore Ts'o: I love it when things Just Work

I am currently in the Hilton Portland & Executive Tower hotel, and since I fly entirely too much, I got upgraded into a room which contains a printer. Thinking that I would try using it, I hooked it up to my laptop (running Ubuntu Gutsy), selected System->Administration->Printing on the desktop, and then clicked on New Printer. To my astonishment, when the dialog box came up, the system had already autodetected the fact that I had an HP OfficeJet KX60xi printer connected to the parallel port, had recommended which driver I should use, and a few “next” and “continue” clicks later, the printer was installed; 15 seconds after that I was able to print to it. Users of MacOS systems are probably used to such things, but this was faster and easier than what Windows asks of users who want to install a new printer. Coming from a Unix background, I would have been quite pleased if I had been able to set up the printer after manually selecting the printer type and driver from a dialog box. Simply not having to su to root, edit some config files, and then restart some daemons would be a major advance. But this completely exceeded my expectations. Well done to everyone in the CUPS and GNOME community who worked to make this possible!

30 October 2007

Theodore Ts'o: Tip o' the hat, wag o' the finger: Linux power savings for laptop users

It’s interesting to see how far we’ve come, and yet how much more work we need to do, on power management for Linux. I recently got a new laptop — a Lenovo Thinkpad X61s — and using the powertop tool, I was able to configure my system to the point where, in what I call “airplane mail reading mode” (mailbox preloaded into memory, USB disabled, wireless and ethernet disabled, backlight down to 30% brightness, sloppily written power hogs like Firefox and Notes shut down — every single application writer should be forced to run powertop and explain why their program feels it necessary to constantly wake up the CPU), I can get my power usage down to about 9.8 watts. Using the 8-cell extended battery, that’s 8 hours of battery life, although granted doing very little. On the flip side, if I’m doing a major kernel compile, I can drive power consumption up to almost 30 watts, which means less than 3 hours of battery life. So that’s definitely the good news; Linux can sharply reduce its power consumption to the point where it is highly competitive with Windows. (And probably better than Vista, just because that OS is so heavy and bloated.) So thanks and a tip of the hat to Intel and to Arjan van de Ven for making such a useful tool as powertop available. So now for the bad news. Getting down to this level of power-saving thriftiness, where the laptop is carefully sipping only minimal amounts of power from the battery, is definitely a work in progress. First of all, you can only get this level of power savings by unloading a specific USB driver, uhci_hcd. This will disable low-speed devices (including, unfortunately, the fingerprint reader and the EVDO WWAN device, if you were silly enough to buy one that was built into the laptop as opposed to a stand-alone card that you can swap between laptops and lend out to friends as necessary). But how many users are going to open up a terminal window, su to root, and type the command “rmmod uhci_hcd”? And know how to reload the driver using “modprobe uhci_hcd” when they need to use the USB devices again? A similar problem exists for NetworkManager; when the user disables the network by right-clicking on the applet, why doesn’t it automatically bring down the interface, instead of forcing the user to manually su to root and then type “ifconfig eth0 down; ifconfig wlan0 down”? A more serious problem is the Intel wireless driver for the 4965. Even with the wlan0 interface configured down, and with the RF kill switch enabled, keeping the iwl4965 driver loaded will still cost you an extra full watt of power. When you’re down to 9.6 watts, that means that keeping the iwl4965 driver loaded when you don’t need it will cost you a 10% reduction in your battery life! That’s just sloppy, and hopefully it will be fixed in a future update to the iwl4965 driver; but as long as you don’t mind manually removing and reloading it, you can work around this power-saving oversight. A bigger issue, though, one for which no workaround exists, is that unlike the ipw3945 driver, which at least had private, non-standard iwpriv commands to engage 802.11’s power-saving features, the iwl4965 driver has neither the non-standard Intel iwpriv interfaces nor the standard iwconfig interfaces for enabling any kind of power saving, including changing the transmit power of the card. So while powertop deserves plenty of kudos, iwl4965 deserves a wag of the finger from a power-saving viewpoint.
No doubt Intel just needs to allocate more money to its Open Source Technology Center so it can get more of its crack developers working on improving Linux support for their processors and chipsets. Speaking of which, I’m still waiting for an Intel X.org 965GM driver that can support compiz/beryl and show video clips at the same time… And being able to turn off the 50 interrupts/second generated by the video card when they aren’t needed, because 3-d graphics aren’t currently in use, without requiring a restart of the X server, would also be a nice touch. The bottom line is that Linux power savings, and Linux support for laptops in general, is much better than it was a year ago, and a lot of credit has to go to the efforts of Intel’s teams producing such good work as powertop, their wireless drivers, and their open X server drivers. We still have a lot of work left to be done, though!

8 October 2007

Theodore Ts'o: Sous Vide, Revisited

In a previous post, I had recommended the 4130 NIST-Traceable Temperature Controller to control the temperature in a slow cooker. Unfortunately, that particular controller has a range that tops out at 60 degrees C / 140 degrees F, which is enough for cooking beef for long periods of time, but not enough for, say, cooking duck confit, for which a sous vide temperature of 80 degrees C is recommended. In addition, the 4130 is pretty expensive, at almost $150. It’s possible to add a resistor to change the range of the 4130, but the temperature displayed by the controller is then no longer correct, and you have to manually create a conversion table between the true temperature and the temperature as seen by the controller. I’ve recently come across a cheaper and better possibility, the Ranco ETC-111000-000 Temperature Controller, which is only half the price and comes with a much larger working range (-30 to 220 degrees F). The price with the AC cord already wired in is $75; the version which just has a 120VAC SPDT relay is only $60. A bit more about food safety. There has always been a lot of concern about bacterial growth and botulism, for good reason — and so the recommendations for cooking temperature have a lot of safety margin in them — to the point now that the USDA recommends that steaks be cooked to at least 145 degrees F, which is well within what had traditionally been called “medium”, and chicken to at least 165 degrees F, which is enough to really destroy taste and texture. Sous vide cooking, especially some of its lower-temperature variants, has raised a lot of concerns, to the point where a few years ago New York City (temporarily) banned it, causing a great outcry in the foodie community, since many top restaurants use sous vide techniques. First of all, any recommendation about internal temperatures and food safety that doesn’t also factor in time is massively oversimplifying the problem. Here is a table taken from “Food Safety Hazards and Controls for the Home Food Preparer”, published by the Hospitality Institute of Technology in 1994:
Temperature (F)    Time to 5D kill    Time to 6.5D kill
130                86.42 minutes      112.34 minutes
135                27.33 minutes       35.53 minutes
140                 8.64 minutes       11.23 minutes
145                 2.73 minutes        3.55 minutes
150                51.85 seconds        1.12 minutes
155                16.40 seconds       21.32 seconds
160                 5.19 seconds        6.74 seconds
165                 1.64 seconds        2.13 seconds
This table lists the time to reduce bacteria concentrations of Salmonella and E. coli from 100,000 to 1 (5D) or 3,162,277 to 1 (6.5D). The FDA and USDA recommend cooking hamburger to 5D destruction. Since it is extremely unlikely for there to be more than 100 Salmonella organisms per gram of meat, a 5D kill will reduce Salmonella concentrations to no more than 1 organism per kilogram. So whether you cook a piece of meat to an internal temperature of 165 degrees for 1.64 seconds, or hold it at 130 degrees for 90 minutes, the effect on Salmonella and E. coli bacteria will be the same. Of course, one concern is that in many forms of cooking, particularly oven roasting and grilling, the temperature of the food as it heats up may not be even; so how do you guarantee that all parts of the food product have been brought up to the requisite 165 degrees? One way that recipe authors, fearful of liability concerns, have done so is to tell people to cook meat to much higher internal temperatures, providing that extra safety margin at the expense of desiccated, horrible-tasting turkey or chicken. But the advantage of sous vide cooking is that by immersing the food in a water bath, the excellent heat conductivity of water helps guarantee that the entire body of meat will be raised to the desired temperature relatively quickly. (How quickly depends on the thickness of the meat, obviously.) So if you hold a roast beef that has been vacuum packed in a Foodsaver bag at 130 degrees F for five or six hours, there should be no question that all of the common bacteria have been inactivated, and that amount of time at 130 degrees F should be sufficient to inactivate 99.9 percent of all botulism toxin molecules (not that there should be any in a fresh piece of meat, of course!). However, sous vide temperatures are not enough to kill bacterial spores, in particular those of C. botulinum, which is responsible for botulism. That requires temperatures far in excess of the boiling point of water at sea level; for example, home canning protocols recommend holding the food product at 250 degrees F for at least 15 minutes. This is an issue in the restaurant business because very often food is cooked sous vide and then stored in the vacuum-sealed bags for potentially weeks (yes, you could be eating an extremely expensive meal at a top-end French restaurant that had been cooked several weeks ago and reheated just before serving; yummy, no?). If the food packages aren’t cooled quickly enough, or are later allowed to warm back into the danger zone, it’s possible that in the anaerobic environment the C. botulinum spores could germinate and then start producing toxin. But if you are cooking sous vide at home, where you are serving the food right after it has been cooked, this shouldn’t be a concern.
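One nice property of the table above is that it follows a simple log-linear pattern: every additional 10 degrees F cuts the required time by a factor of ten, and the 6.5D column is just the 5D column scaled by 6.5/5. Here is a small back-of-the-envelope calculation showing the fit; this is my own model of the published numbers, not an official FDA/USDA formula, so treat it as an illustration rather than food-safety guidance:

/* Back-of-the-envelope fit to the pasteurization table above: the 5D
 * time at 130F is 86.42 minutes, and every additional 10F divides the
 * required time by ten.  The 6.5D column is the 5D column scaled by
 * 6.5/5.  My own fit to the published numbers, not an official formula. */
#include <stdio.h>
#include <math.h>

static double kill_time_minutes(double temp_f, double log_reduction)
{
    double five_d = 86.42 * pow(10.0, (130.0 - temp_f) / 10.0);
    return five_d * (log_reduction / 5.0);
}

int main(void)
{
    /* Reproduce a couple of rows of the table as a sanity check:
     * should print roughly 2.73 minutes and 2.13 seconds. */
    printf("145F, 5D kill:   %.2f minutes\n", kill_time_minutes(145.0, 5.0));
    printf("165F, 6.5D kill: %.2f seconds\n", kill_time_minutes(165.0, 6.5) * 60.0);
    return 0;
}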

4 October 2007

Theodore Ts'o: Hans Reiser, 20/20, and his talk at Google

I got a call from one of the researchers from ABC News this evening. Apparently they are planning on doing a segment on Hans Reiser for their 20/20 show, and the researcher knew enough to realize that there were some serious technical inaccuracies in the script (at the level of “so when Hans was creating reiser4, was that software; was he writing a program?” and “does the program run on the hard drive?”), and so they were looking for help on some technical issues on a background basis. She didn’t take any quotes from me; what she needed was help understanding the technical issues around filesystems and Linux. I tried to explain what a filesystem was in a way that would make sense to a lay person in 15 seconds; I have no idea whether the researcher got it, and whether she’ll be able to make changes to the script that will be vaguely coherent. We’ll see. I spent a lot of time working with Joshua Davis, providing background material for his Wired article, and he still got a bunch of the technical details wrong. The 20/20 researcher did ask me some silly questions, such as whether we were surprised when murder charges were filed against Hans. Well, yes. What did she expect me to say? “Oh, yes, we were always worried someone would get hurt…” Not! Sigh…. I might not agree with Hans’s filesystem design principles or his tactics in trying to get reiser4 accepted, but I always respected him as a fellow open source programmer. The researcher mentioned to me that Hans had done a talk at Google that was available on the Google Video site. Out of curiosity, I took a look at it. It was interesting; the last time I had seen Hans was in 1999, at the Linux Storage Management Workshop in Darmstadt, Germany, which was organized by Matthew O’Keefe at Sistina Software. Compared to how he looked back then, and to the picture of him in the Wired article, taken 10-12 months after his talk at Google in February 2006, I was struck by how much heavier (and older, at least compared to my memory of him, which was much closer to this picture from his resume) he looked in the Google video. I also remember him as being a much more dynamic and energetic speaker, and I was struck by how slowly he spoke, with lots of long pauses and “umms” and “ahhs”. He seemed to be a much better public speaker in Darmstadt; in the Google talk he seemed very tired. One of the questions that the researcher asked me was whether I thought he was a genius or not. I told her that I thought he was quite bright in terms of raw intelligence, but that his social IQ wasn’t as high as you might want for someone to be successful in gathering volunteers to work on an Open Source project, and in working with others in an Open Source development community. Looking at the video, I think that is very much true. He was a terrible public speaker, but some of the points he made about optimizing B-tree algorithms made sense. I might disagree with his philosophy of filesystem design and benchmarking, and I might not be terribly impressed by his social skills, but in terms of being a talented computer scientist, he was and is that. Is he guilty of the crime that he has been accused of? I have no idea. But from looking at his talk and knowing what I know of him, I have the sense of a Greek tragedy. He’s been working on some of his ideas since his undergraduate days at UC Berkeley in the early eighties — which he entered after finishing the 8th grade.
And as research ideas, I think he might have gotten some very interesting results out of trying some of the things that he wants to do around operating system namespaces. (I think they are doomed in a production system descended from Unix, since application programmers are unlikely to rewrite their programs to take advantage of reiserfs’s performance characteristics — which would mean, instead of using a Unix-like configuration file, treating a directory hierarchy containing the configuration information like a Windows registry. Still, as an academic, Plan 9-like system, I could have seen it as being a potentially very interesting systems research effort.) Unfortunately, his skills at public speaking and his ability to work with other people have handicapped him, and I know that has frustrated him deeply. So I have a lot of sympathy for him, and I hope that he is innocent, and will be found innocent. But only time will tell…. Anyway, according to the researcher, they are currently scheduling their segment about Hans for the October 19th edition of 20/20. I’m sure that schedule is subject to change, but it’ll be interesting to see how they treat Hans in their coverage. Hopefully it will be fair, and not overly sensationalistic, but unfortunately my faith in today’s edutainment-focused TV news programs isn’t terribly high. My impression was that the researcher wanted to do a good job, but she was burdened by a very tight deadline, and in the end, the decision of what goes on the air and what doesn’t won’t be up to her. So we’ll see.

22 September 2007

Theodore Ts'o: How to properly support writers/artists?

Russell Coker, commenting on my last blog post, and apparently after exploring some of the links stemming from the SFWA kerfuffle, stumbled on a post from former SFWA VP Howard V. Hendrix, in which Hendrix took the amazing position (for an SF writer) that he hated using the internet, and that people who posted their stories on the web for free download were “web-scabs”. In response, Russell has taken the position that since such comments were an attack on our (Open Source developers’) community, he would resolve “to not buy any more Sci-Fi books until I have read all the freely available books that I want to read”. Obviously, that’s his choice, but while I don’t have much respect for SFWA the organization, and certainly not for their choice in past and current vice presidents, there’s another side to the story here. First of all, Dr. Hendrix’s comments are not the official position of the SFWA, and there are many other SFWA members who would very strongly disagree with both the attitudes of Dr. Hendrix and the ham-handed DMCA pseudo-invocation by Dr. Burt. In addition, to quote Rick Cook:
The first thing you’ve got to understand about the Science Fiction and Fantasy Writers of America is that it isn’t. Like the Holy Roman Empire, which in Voltaire’s phrase was neither holy, Roman nor an empire, SFWA is not an organization of science fiction and fantasy writers. While some of the leading SF and Fantasy writers belong, the vast majority of the members are people who barely meet SFWA’s extremely lax publication requirements. They are not professional SF or Fantasy writers in any meaningful sense of the term and many of them haven’t published a word of either science fiction or fantasy in years.
Secondly, there are plenty of Science Fiction writers who do understand this issue quite well. In addition to Rick Cook, whom I recommended in my last post, another example of a Science Fiction writer who has penned a very cogent series of articles about copyright, science fiction, and the business issues of being an SFF writer is Eric Flint. I strongly recommend his series, “Salvos Against Big Brother”, which includes a back-to-basics examination of copyright, quoting and reprinting two 1841 speeches by the British parliamentarian Thomas Macaulay. Definitely worth a read, and again a demonstration that there exist Science Fiction authors who aren’t stuck in the dark ages; few (at least it is to be hoped) are like Dr. Hendrix. Eric Flint is also a senior editor for Baen Books (read more about the founder, Jim Baen, here). Baen makes all of its titles available in e-book form without DRM, and many of its authors have agreed to make their books available completely free of charge. Eric Flint does so for all or most of his books shortly after they are published in mass-market paperback form; others make only a few of their books available, typically the first or second books in a series (in the hopes you will buy the rest of their books) — a wise strategy, as he explains in one of his Salvos Against Big Brother columns. More importantly, I strongly believe that if we enjoy an artist’s works, we should support the artist. That’s why I’ve directly reached out and given money to musicians, authors, and Debian release engineers. (Yes, that last was controversial, but to me and my personal ethics, it’s all of the same piece.) Is patronage the right way to support musicians? Well, it’s one way, and I’ve always been fond of the “distributed patronage” model, where we use the Internet to allow a large number of people to each contribute to support an artist’s work. The Big Meow is a good example of how it might work. (By the way, to people who are wondering what is happening with The Big Meow — I have very recently pinged Diane, and she’s working on it. Between health and family emergencies, the last 12 months have thrown a lot of delays into her writing schedule.) Are there other models besides patronage that might work? Well, there is the traditional one — just buying the author’s books. But what if we don’t want a dead-tree copy and just want to be able to read it on our iRex iLiad, and the book wasn’t published by Baen Books or one of the few other enlightened publishers who make non-DRM’d eBooks available? That’s a harder question. Personally, I don’t find “copyright theft” immoral per se. Illegal, yes, but immoral only if I haven’t done something to materially support the author. If I’ve purchased a new copy of a book, and the eBook version isn’t available via legal means, I don’t believe it is immoral to download it from a site like scribd so I can read it on my laptop. Of course, that brings up other questions, such as: what if the book is out of print (because the publisher doesn’t think it’s commercially viable to reissue it), the author is dead, and the widow needs money? Lots of hard questions, and no good answers…. But in any case, I think it is the right thing to do to support those authors we care about as we can, and boycotting all SFF books isn’t necessarily appropriate or helpful.
